10 research outputs found

    Improved method for finding optimal formulae for bilinear maps in a finite field

    In 2012, Barbulescu, Detrey, Estibals and Zimmermann proposed a new framework to exhaustively search for optimal formulae for evaluating bilinear maps, such as Strassen or Karatsuba formulae. The main contribution of this work is a new criterion to aggressively prune useless branches in the exhaustive search, thus leading to the computation of new optimal formulae, in particular for the short product modulo X^5 and the circulant product modulo (X^5 - 1). Moreover, we are able to prove that there is essentially only one optimal decomposition of the product of 3 x 2 by 2 x 3 matrices up to the action of some group of automorphisms.
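
    For readers unfamiliar with the two bilinear maps named in this abstract, the following is a minimal sketch of their schoolbook definitions, not the optimal formulae found by the authors. The function names, the low-to-high coefficient lists, and the use of plain integer coefficients (rather than a finite field) are assumptions made for illustration.

```python
def short_product(a, b, n=5):
    """Product of two polynomials truncated modulo X^n.

    Polynomials are lists of coefficients, lowest degree first.
    Only terms with i + j < n are kept.
    """
    c = [0] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            c[i + j] += ai * bj
    return c


def circulant_product(a, b, n=5):
    """Product of two polynomials modulo X^n - 1.

    Since X^n = 1, the exponent i + j wraps around modulo n.
    """
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % n] += ai * bj
    return c
```

    With n = 5 and inputs of length 5, these naive loops perform 15 and 25 coefficient multiplications respectively; the search described in the abstract looks for formulae that use as few multiplications as possible.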

    Fast integer multiplication using generalized Fermat primes

    For almost 35 years, Schönhage-Strassen's algorithm has been the fastest algorithm known for multiplying integers, with a time complexity O(n log n log log n) for multiplying n-bit inputs. In 2007, Fürer proved that there exists K > 1 and an algorithm performing this operation in O(n log n K^{log^* n}). Recent work by Harvey, van der Hoeven, and Lecerf showed that this complexity estimate can be improved in order to get K = 8, and conjecturally K = 4. Using an alternative algorithm, which relies on arithmetic modulo generalized Fermat primes, we obtain conjecturally the same result K = 4 via a careful complexity analysis in the deterministic multitape Turing model.

    Parallel Integer Polynomial Multiplication

    We propose a new algorithm for multiplying dense polynomials with integer coefficients in a parallel fashion, targeting multi-core processor architectures. Complexity estimates and experimental comparisons demonstrate the advantages of this new approach

    Big Prime Field FFT on the GPU

    We consider prime fields of large characteristic, typically fitting on k machine words, where k is a power of 2. When the characteristic of these fields is restricted to a subclass of the generalized Fermat numbers, we show that arithmetic operations in such fields offer attractive performance both in terms of algebraic complexity and parallelism. In particular, these operations can be vectorized, leading to efficient implementation of fast Fourier transforms on graphics processing units.

    Big Prime Field FFT on Multi-core Processors

    We report on a multi-threaded implementation of Fast Fourier Transforms over generalized Fermat prime fields. This work extends a previous study realized on graphics processing units to multi-core processors. In this new context, we overcome the less fine control of hardware resources by successively using FFT in support of the multiplication in those fields. We obtain favorable speedup factors (up to 6.9x on a 6-core, 12-thread node, and 4.3x on a 4-core, 8-thread node) of our parallel implementation compared to the serial implementation for the overall application, thanks to the low memory footprint and the sharp control of arithmetic instructions of our implementation of generalized Fermat prime fields.

    The Basic Polynomial Algebra Subprograms

    The Basic Polynomial Algebra Subprograms (BPAS) provides arithmetic operations (multiplication, division, root isolation, etc.) for univariate and multivariate polynomials over common types of coefficients (prime fields, complex rational numbers, rational functions, etc.). The code is mainly written in CilkPlus [10] targeting multicore processors. The current distribution focuses on dense polynomials and the sparse case is work in progress. A strong emphasis is put on adaptive algorithms as the library aims at supporting a wide variety of situations in terms of problem sizes and available computing resources. The BPAS library is publicly available in source at www.bpaslib.org

    Algorithmes de multiplication : complexité bilinéaire et méthodes asymptotiquement rapides

    No full text
    Since 1960 and the result of Karatsuba, we know that the complexity of multiplication (of integers or polynomials) is sub-quadratic: given a ring R, the product in R[X] of the polynomials a_0 + a_1 X and b_0 + b_1 X, for any a_0, a_1, b_0 and b_1 in R, can be computed with three rather than four multiplications over R: (a_0 + a_1 X)(b_0 + b_1 X) = m_0 + (m_2 - m_0 - m_1) X + m_1 X^2, with the three multiplications m_0 = a_0 b_0, m_1 = a_1 b_1 and m_2 = (a_0 + a_1)(b_0 + b_1). In the same manner, Strassen's algorithm allows one to multiply two 2n x 2n matrices with only seven products of n x n matrices. The two previous examples fall in the category of bilinear maps: these are functions of the form Phi : K^m x K^n -> K^l, for a given field K, linear in each variable. Among the most classical bilinear maps, we have the multiplication of polynomials, of matrices, or of elements of algebraic extensions of finite fields. Given a bilinear map Phi, computing the minimal number of multiplications necessary to evaluate this map is an NP-hard problem. The purpose of this thesis is to propose algorithms minimizing this number of multiplications. Two angles of attack have been followed. The first is to study the computation of the bilinear complexity by reformulating the problem as a search for matrix subspaces of a given rank. This work led to an algorithm that takes into account intrinsic properties of the considered products, such as matrix or polynomial products over finite fields. This algorithm made it possible to find all the possible decompositions, over F_2, for the product of polynomials modulo X^5 and the product of 3x2 by 2x3 matrices. The second aspect of this thesis is the development of asymptotically fast algorithms for integer multiplication. A particular family of such algorithms was proposed following an article by Fürer published in 2007, which gave a first algorithm, relying on the fast Fourier transform (FFT), that multiplies n-bit integers in O(n log n 2^{O(log^* n)}), where log^* is the iterated logarithm function. In this thesis, an algorithm whose complexity depends on a number-theoretic conjecture is proposed, relying on the FFT and on generalized Fermat primes. A careful complexity analysis of this algorithm yields a complexity of O(n log n 4^{log^* n}).
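
    The Karatsuba identity quoted in this abstract can be checked directly. The sketch below transcribes it for degree-1 polynomials; the function name and the choice of Python integers as the ring R are assumptions for illustration.

```python
def karatsuba_degree1(a0, a1, b0, b1):
    """Multiply (a0 + a1*X) by (b0 + b1*X) with three multiplications.

    Returns the coefficients (c0, c1, c2) of c0 + c1*X + c2*X^2.
    """
    m0 = a0 * b0                  # constant term
    m1 = a1 * b1                  # leading term
    m2 = (a0 + a1) * (b0 + b1)    # contains m0 + m1 + the middle term
    return (m0, m2 - m0 - m1, m1)
```

    For instance, (1 + 2X)(3 + 4X) = 3 + 10X + 8X^2, which the function reproduces using three products instead of the four required by the schoolbook method.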

    Multiplication algorithms : bilinear complexity and fast asymptotic methods

    No full text
    Since 1960 and the result of Karatsuba, we know that the complexity of multiplication (of integers or polynomials) is sub-quadratic: given a ring R, the product in R[X] of the polynomials a_0 + a_1 X and b_0 + b_1 X, for any a_0, a_1, b_0 and b_1 in R, can be computed with three rather than four multiplications over R: (a_0 + a_1 X)(b_0 + b_1 X) = m_0 + (m_2 - m_0 - m_1) X + m_1 X^2, with the three multiplications m_0 = a_0 b_0, m_1 = a_1 b_1 and m_2 = (a_0 + a_1)(b_0 + b_1). In the same manner, Strassen's algorithm allows one to multiply two 2n x 2n matrices with only seven products of n x n matrices. The two previous examples fall in the category of bilinear maps: these are functions of the form Phi : K^m x K^n -> K^l, for a given field K, linear in each variable. Among the most classical bilinear maps, we have the multiplication of polynomials, of matrices, or of elements of algebraic extensions of finite fields. Given a bilinear map Phi, computing the minimal number of multiplications necessary to evaluate this map is an NP-hard problem. The purpose of this thesis is to propose algorithms minimizing this number of multiplications. Two angles of attack have been followed. The first is to study the computation of the bilinear complexity by reformulating the problem as a search for matrix subspaces of a given rank. This work led to an algorithm that takes into account intrinsic properties of the considered products, such as matrix or polynomial products over finite fields. This algorithm made it possible to find all the possible decompositions, over F_2, for the product of polynomials modulo X^5 and the product of 3x2 by 2x3 matrices. The second aspect of this thesis is the development of asymptotically fast algorithms for integer multiplication. A particular family of such algorithms was proposed following an article by Fürer published in 2007, which gave a first algorithm, relying on the fast Fourier transform (FFT), that multiplies n-bit integers in O(n log n 2^{O(log^* n)}), where log^* is the iterated logarithm function. In this thesis, an algorithm whose complexity depends on a number-theoretic conjecture is proposed, relying on the FFT and on generalized Fermat primes. A careful complexity analysis of this algorithm yields a complexity of O(n log n 4^{log^* n}).
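
    The iterated logarithm log^* that appears in these complexity bounds counts how many times the logarithm must be applied before the value drops to 1 or below. A minimal sketch follows; the base-2 convention and the function name are assumptions, since the abstract does not fix them.

```python
import math


def log_star(n):
    """Iterated base-2 logarithm: number of times log2 is applied
    to n before the result is less than or equal to 1."""
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)
        count += 1
    return count
```

    The function grows extremely slowly: log_star(65536) is only 4, which is why a factor such as 4^{log^* n} stays small for any input size met in practice.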

    Big Prime Field FFT on Multi-core Processors

    We report on a multi-threaded implementation of Fast Fourier Transforms over generalized Fermat prime fields. This work extends a previous study realized on graphics processing units to multi-core processors. In this new context, we overcome the less fine control of hardware resources by successively using FFT in support of the multiplication in those fields. We obtain favorable speedup factors (up to 6.9x on a 6-core, 12-thread node, and 4.3x on a 4-core, 8-thread node) of our parallel implementation compared to the serial implementation for the overall application, thanks to the low memory footprint and the sharp control of arithmetic instructions of our implementation of generalized Fermat prime fields.

    Basic Polynomial Algebra Subprograms

    No full text